DOC: add sections about big new features (CoW, string dtype) to 3.0.0 whatsnew notes #61724


Open
wants to merge 4 commits into
base: main

jorisvandenbossche
Member

We don't actually yet list the bigger features (string dtype, CoW, no silent downcasting) in the 3.0.0 whatsnew page, so starting to do that here.

Already pushed a section about string dtype, will further add a section about CoW and the downcasting.


Starting with pandas 3.0, a dedicated string data type is enabled by default
(backed by PyArrow under the hood, if installed, otherwise falling back to
NumPy). This means that pandas will start inferring columns containing string
Member

Suggested change
- NumPy). This means that pandas will start inferring columns containing string
+ ``object``-dtype backed by NumPy). This means that pandas will start inferring columns containing string

Member Author

With that edit, it seems to suggest that it falls back to "object" dtype (as we currently use)? But it still has a StringDtype, just using NumPy instead of PyArrow under the hood. So it is "backed by NumPy object dtype under the hood", so something like:

Suggested change
- NumPy). This means that pandas will start inferring columns containing string
+ being backed by NumPy ``object``-dtype). This means that pandas will start inferring columns containing string

Member

Yup, that edit is clearer. I just didn't want the wording to be ambiguous and suggest it could be the new NumPy string type (which it isn't).
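
To make the distinction in this thread concrete, here is a minimal sketch (not part of the PR) that checks what the installed pandas infers for a column of strings. It assumes nothing beyond `pd.StringDtype` and its `storage` attribute. On pandas 2.x with default options the result is expected to be NumPy ``object`` dtype, while on 3.0 it should be a `StringDtype` whose storage is either PyArrow or the NumPy-object-backed fallback.

```python
import pandas as pd

# Let pandas infer the dtype for a column of strings (no dtype= given).
ser = pd.Series(["a", "b", None])

# True when inference produced the dedicated string dtype rather than
# plain NumPy object dtype.
uses_string_dtype = isinstance(ser.dtype, pd.StringDtype)

# StringDtype records its backing storage: "pyarrow" when PyArrow is used,
# "python" for the fallback that stores Python str objects in a NumPy
# object array. Plain object dtype has no such attribute.
storage = getattr(ser.dtype, "storage", None)

print(pd.__version__, ser.dtype, uses_string_dtype, storage)
```

In both cases the fallback is still a `StringDtype`; only the storage differs, which is the point of the reworded sentence.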

how pandas operates with respect to copies and views. A summary of the changes:

1. The result of *any* indexing operation (subsetting a DataFrame or Series in any way,
i.e. including accessing a DataFrame column as a Series) or any method returning a
Member

Suggested change
- i.e. including accessing a DataFrame column as a Series) or any method returning a
+ e.g. accessing a DataFrame column as a Series) or any method returning a
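
As an illustration of item 1 in the quoted summary, the sketch below (not taken from the PR) shows a column access and a row slice each behaving like a copy under Copy-on-Write. It assumes that on pandas 2.x the behaviour must be opted into through the `mode.copy_on_write` option, and that on 3.0 it is the default, so the option is only set on older versions.

```python
import pandas as pd

# Assumption: Copy-on-Write is opt-in before pandas 3.0 and the default
# from 3.0 on, so the option is only touched on older versions.
if int(pd.__version__.split(".")[0]) < 3:
    pd.set_option("mode.copy_on_write", True)

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Accessing a DataFrame column as a Series is an indexing operation.
col = df["a"]
col.iloc[0] = 100   # modifies only `col`; `df` is left unchanged

# A row slice is also an indexing result.
head = df.iloc[0:2]
head.iloc[0, 1] = 0  # modifies only `head`; `df` is left unchanged

print(df)
print(col)
print(head)
```

To change `df` itself under these semantics, the assignment has to target `df` directly, for example `df.loc[0, "a"] = 100`.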

2 participants